Updating Jenkins slaves
Rolling out new slaves
TODO
- update templates
- create templates
- spawn new slaves
- decommission old slaves
- tags
- how to test if it worked safely
Updating existing slaves
Some minor maintenance work does not require rolling out new slaves, for example upgrading a library or another dependency. In this case, the running cluster can be updated in place, without creating a new template and redeploying every machine. However, the template must still be updated at the end of the procedure, so that machines created later also include the change.
The upgrade procedure consists of:
1. Manually upgrade a subset of nodes
2. Test if CI jobs pass on the new set of nodes
3. Upgrade the rest of the nodes
1. Manually upgrade a subset of nodes
- Select one Jenkins slave to use.
- Make sure that there are other slaves that perform the same function (they have the same tags and are up).
- In the Jenkins UI, press "Mark this node as temporarily offline", providing the reason.
- Note the IP address of the slave.
- Manually run ansible against the node.
NOTE: this is a one-off role. It doesn't necessarily need to be committed to version control, so any "hacks" are allowed. For example, to upgrade a chocolatey package, in addition to updating the version parameter of the win_chocolatey module, one must also run an additional task that uninstalls the package first. See the examples below.
- Check out the github.com/Juniper/contrail-windows-ci repository.
- Enter the ansible/ directory.
- Prepare your environment by going through the steps described in README.md.
- Find the ansible role that is responsible for the upgrade.
- Modify the role to suit your needs.
- If the modification could cause a slave restart, manually disable autostart of jenkins_swarm_client:
    Stop-Service jenkins_swarm_client
    Set-Service jenkins_swarm_client -StartupType Manual
- NOTE: jenkins_swarm_client must be disabled, since a reboot removes the "Mark this node as temporarily offline" status from the slave. After the update procedure completes, you should re-enable this service to reconnect the builder to Jenkins.
- Add a tag to every task that needs to be executed.
- Prepare inventory file.
- Edit the file inventory.devel/groups.
- Add the IP of your selected slave under the correct group.
- Create one-off playbook.
- Create a YAML file in the root of the ansible/ directory.
- Specify the hosts and roles to be executed.
- Run the playbook, specifying:
- ansible vault key location,
- inventory,
- task tags to execute,
- prepared playbook.
- Test if the change worked.
- Remote into the machine and verify that the change was successful.
- Test if job passes.
- In Jenkins, go into the selected node view. Remove all tags it has. Add some temporary tag (e.g. 'temp-upgrade').
- Open a test pull request to github.com/Juniper/contrail-windows-ci.
- In its Jenkinsfile, replace the tags that you removed from the node with the temporary tag.
- Open the pull request with "do not merge" in the title.
- Wait for CI to be triggered normally. Wait for it to pass.
- If autostart of jenkins_swarm_client was disabled:
    Set-Service jenkins_swarm_client -StartupType Automatic
    Start-Service jenkins_swarm_client
NOTE: the following steps may vary on a case-by-case basis.
- Checkout
- Open a normal pull request with the cleaned-up changes to the ansible roles.
- Get it merged.
- Create a new builder template. (see this)
- Roll out the change to the other slaves by repeating steps 1-2, specifying larger subsets of nodes.
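The last step, repeating the upgrade on larger subsets of nodes, can be sketched as a small wrapper around the ansible-playbook command shown in the Examples section. This is a minimal sketch, not something that exists in the repository: rollout_batch and the ANSIBLE_PLAYBOOK override variable are hypothetical names, and it assumes that every host in the batch is already listed in the inventory and marked offline in Jenkins. The --limit option is a standard ansible-playbook option that restricts the run to the given comma-separated hosts.

```shell
#!/bin/sh
# ANSIBLE_PLAYBOOK is a hypothetical override hook; it defaults to the real CLI.
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

# Hypothetical helper: run a tagged playbook against an explicit batch of hosts.
# Usage: rollout_batch TAGS PLAYBOOK HOST [HOST...]
rollout_batch() {
    tags="$1"
    playbook="$2"
    shift 2

    # Refuse to run without hosts, so a typo never targets the whole group.
    if [ "$#" -eq 0 ]; then
        echo "rollout_batch: no hosts given" >&2
        return 1
    fi

    # Join the remaining arguments with commas: "a b c" -> "a,b,c".
    limit="$(IFS=,; echo "$*")"

    # Inventory and vault file paths are taken from the example command.
    "$ANSIBLE_PLAYBOOK" -i inventory.devel \
        --vault-password-file ~/.ansible-vault \
        --tags "$tags" --limit "$limit" "$playbook"
}
```

For example, calling rollout_batch bump mysite.yml followed by the IP addresses of the next two slaves would run only the tasks tagged bump on those two hosts. Each batch can then be verified as in steps 1-2 before moving on to a larger one.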
Examples.
- Upgrade golang version.
ansible/roles/builder/tasks/main.yml:
...
# Uninstall first: updating only the version parameter is not enough (see NOTE above).
- name: Uninstall golang
  win_chocolatey:
    name: golang
    state: absent
  tags: bump
- name: Install golang
  win_chocolatey:
    name: golang
    version: 1.10.0
    state: present
  tags: bump
...
ansible/mysite.yml:
---
- name: 'bump golang'
  hosts:
    - builder
  roles:
    - builder
inventory.devel/groups:
...
[builder]
10.84.12.87
...
Command to run:
ansible-playbook -i inventory.devel --vault-password-file ~/.ansible-vault --tags "bump" mysite.yml
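Before running the command above against a live slave, it can help to preview what it would touch. The following is a minimal sketch under assumptions: preview_bump and the ANSIBLE_PLAYBOOK override variable are hypothetical names introduced here for illustration. The --list-hosts and --list-tasks options are standard ansible-playbook options that print the matched hosts and the tasks selected by the given tags without executing anything.

```shell
#!/bin/sh
# ANSIBLE_PLAYBOOK is a hypothetical override hook; it defaults to the real CLI.
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

# Hypothetical helper: preview a tagged one-off playbook run.
# Usage: preview_bump TAGS PLAYBOOK
preview_bump() {
    # Print the hosts the playbook would target (nothing is executed).
    "$ANSIBLE_PLAYBOOK" -i inventory.devel \
        --vault-password-file ~/.ansible-vault \
        --tags "$1" --list-hosts "$2" || return 1

    # Print the tasks selected by the tags (nothing is executed).
    "$ANSIBLE_PLAYBOOK" -i inventory.devel \
        --vault-password-file ~/.ansible-vault \
        --tags "$1" --list-tasks "$2" || return 1
}
```

With the example files above, preview_bump bump mysite.yml should list only the slave added under [builder] and only the tasks tagged bump. If anything else shows up, the inventory or the tags need another look before the real run.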